HYPERFLEET-752 | ci: Improve E2E CI Test deployment logic#51
Conversation
WalkthroughAdds centralized debug-log capture and automated cleanup to CLM deploy scripts. Introduces DEBUG_LOG_DIR (default: ${PROJECT_ROOT}/.debug-work) and CLI flag Sequence DiagramsequenceDiagram
actor User
participant DeployScript as Deploy Script
participant Helm
participant Kubernetes
participant DebugLogs as Debug Log Capture
User->>DeployScript: Run install (optional --debug-log-dir)
DeployScript->>Helm: helm upgrade --install <release>-<rand>
alt Helm install succeeds
Helm-->>DeployScript: Release created
DeployScript->>Kubernetes: Run health check probe
alt Health check passes
Kubernetes-->>DeployScript: Healthy
DeployScript-->>User: Installation complete
else Health check fails
Kubernetes-->>DeployScript: Unhealthy
DeployScript->>DebugLogs: capture_debug_logs(namespace, selector, release, DEBUG_LOG_DIR)
DebugLogs->>Kubernetes: kubectl logs/describe/events/workloads/services...
Kubernetes-->>DebugLogs: Diagnostics collected
DebugLogs-->>DeployScript: Logs saved
DeployScript->>Helm: helm uninstall <release>-<rand> -n <ns> --wait --timeout 5m
Helm-->>DeployScript: Uninstall result
DeployScript-->>User: Installation failed
end
else Helm install fails
Helm-->>DeployScript: Install failed
DeployScript->>Helm: helm list -n <ns> --selector adapter-resource-type=...,adapter-name=... -q
alt Matching releases found
Helm-->>DeployScript: Releases listed
DeployScript->>DebugLogs: capture_debug_logs(namespace, selector, release, DEBUG_LOG_DIR)
DebugLogs->>Kubernetes: kubectl logs/describe/events/workloads/services...
DebugLogs-->>DeployScript: Logs saved
DeployScript->>Helm: helm uninstall <matching-release> -n <ns> --wait --timeout 5m
Helm-->>DeployScript: Uninstall result(s)
else No labeled releases
DeployScript->>Helm: fallback: list by release-name prefix
Helm-->>DeployScript: Releases listed
DeployScript->>Helm: helm uninstall <matching-release> -n <ns> --wait --timeout 5m
end
DeployScript-->>User: Installation failed
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deploy-scripts/deploy-clm.sh (1)
445-459: Debug log preservation logic may fail if DEBUG_LOG_DIR is outside WORK_DIR.If a user specifies
--debug-log-dir /var/log/hyperfleet(a path outsideWORK_DIR), this preservation logic would unnecessarily move logs to a temp directory and back. Worse, if the custom path doesn't exist initially, the check passes but subsequent operations might fail.Consider adding a check to only preserve when DEBUG_LOG_DIR is a subdirectory of WORK_DIR:
♻️ Proposed fix
# Clean up work directory (but preserve debug logs) if [[ "${DRY_RUN}" == "false" && "${VERBOSE}" == "false" ]]; then log_verbose "Cleaning up work directory" # Preserve debug logs if they exist - if [[ -d "${DEBUG_LOG_DIR}" ]]; then + if [[ -d "${DEBUG_LOG_DIR}" && "${DEBUG_LOG_DIR}" == "${WORK_DIR}"/* ]]; then local temp_debug_dir temp_debug_dir=$(mktemp -d) mv "${DEBUG_LOG_DIR}" "${temp_debug_dir}/debug-logs" 2>/dev/null || true rm -rf "${WORK_DIR}" mkdir -p "${WORK_DIR}" mv "${temp_debug_dir}/debug-logs" "${DEBUG_LOG_DIR}" 2>/dev/null || true rm -rf "${temp_debug_dir}" else rm -rf "${WORK_DIR}" fi fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy-scripts/deploy-clm.sh` around lines 445 - 459, The preservation logic for DEBUG_LOG_DIR may move or touch paths outside WORK_DIR; update the cleanup block to first resolve paths (e.g., realpath) and only perform the temp-preserve dance when DEBUG_LOG_DIR exists and its resolved path is under WORK_DIR's resolved path (starts-with check); if DEBUG_LOG_DIR is outside WORK_DIR or doesn't exist, skip moving it and simply rm -rf "${WORK_DIR}" as before; reference the DEBUG_LOG_DIR and WORK_DIR variables and the existing temp_debug_dir usage when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deploy-scripts/deploy-clm.sh`:
- Around line 445-459: The preservation logic for DEBUG_LOG_DIR may move or
touch paths outside WORK_DIR; update the cleanup block to first resolve paths
(e.g., realpath) and only perform the temp-preserve dance when DEBUG_LOG_DIR
exists and its resolved path is under WORK_DIR's resolved path (starts-with
check); if DEBUG_LOG_DIR is outside WORK_DIR or doesn't exist, skip moving it
and simply rm -rf "${WORK_DIR}" as before; reference the DEBUG_LOG_DIR and
WORK_DIR variables and the existing temp_debug_dir usage when making this
change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: de36bfd4-cfa9-4f04-a8bc-5b0f18af6067
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
|
/test lint |
Code reviewFound 1 issues:
hyperfleet-e2e/deploy-scripts/deploy-clm.sh Lines 446 to 460 in 1ae2387 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
1ae2387 to
4558a57
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deploy-scripts/lib/common.sh`:
- Around line 195-201: The debug-log capture currently allows empty WORK_DIR and
ignores failures from directory creation and capture commands; update the logic
around the output_dir and capture steps so failures are propagated: validate
that WORK_DIR (used when computing output_dir) is non-empty before defaulting,
check the return code of mkdir -p for "${output_dir}" and exit non-zero with an
error log via log_section/processLogger on failure, and similarly add explicit
error checks for the capture commands referenced around lines 259-264 so any
failed capture causes the function to log the error and return a non-zero exit
code instead of silently succeeding.
- Around line 202-205: The current timestamp used to build log_prefix (variables
timestamp and log_prefix, referencing output_dir and component_name) has only
second-level granularity and can collide; update timestamp generation to be
collision-resistant by including higher-resolution time and a unique process
identifier (e.g., use date +"%Y%m%d-%H%M%S-%N" and append $$ or similar) or use
a safe unique generator (mktemp/uuid) and then rebuild
log_prefix="${output_dir}/${component_name}-${timestamp}" so concurrent runs
cannot overwrite each other.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3d925e03-58fb-4db5-bc75-505579580ac6
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
🚧 Files skipped from review as they are similar to previous changes (4)
- deploy-scripts/deploy-clm.sh
- deploy-scripts/lib/api.sh
- deploy-scripts/lib/sentinel.sh
- deploy-scripts/lib/adapter.sh
4558a57 to
bec5898
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deploy-scripts/deploy-clm.sh`:
- Around line 101-103: The help text for the --debug-log-dir flag is
inconsistent with the actual default
(DEBUG_LOG_DIR="${DEBUG_LOG_DIR:-${PROJECT_ROOT}/.debug-work}"); update the
documented default in the option/usage text for --debug-log-dir to
"${PROJECT_ROOT}/.debug-work" so the help matches the implementation (or
alternatively change the DEBUG_LOG_DIR assignment to use ${WORK_DIR}/debug-logs
if you prefer the documented path); ensure you modify the help string that
references ${WORK_DIR}/debug-logs and keep the flag name --debug-log-dir and
variable DEBUG_LOG_DIR in sync.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 446ca419-15e2-44b0-98da-c957b9c3b2ba
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
🚧 Files skipped from review as they are similar to previous changes (3)
- deploy-scripts/lib/api.sh
- deploy-scripts/lib/common.sh
- deploy-scripts/lib/sentinel.sh
bec5898 to
e17776f
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deploy-scripts/lib/adapter.sh (1)
110-112: Inefficient/dev/urandomread pattern.
head /dev/urandomreads until a newline character, but/dev/urandomrarely produces newlines, causing it to buffer a large chunk (up to 64KB) before piping totr. This wastes entropy and CPU cycles.♻️ Proposed fix
# Generate random suffix to prevent namespace conflicts local random_suffix - random_suffix=$(head /dev/urandom | LC_ALL=C tr -dc 'a-z0-9' | head -c 8) + random_suffix=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 8)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy-scripts/lib/adapter.sh` around lines 110 - 112, Replace the unbounded "head /dev/urandom" pattern used to generate random_suffix with a fixed-byte read from /dev/urandom so you only consume the entropy you need; update the random_suffix assignment (the random_suffix variable initialization) to read a small, fixed number of bytes (e.g., one block) from /dev/urandom and then filter to [a-z0-9] and cut to 8 characters, rather than piping an open-ended head, to avoid buffering a large chunk and wasting CPU/entropy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deploy-scripts/lib/adapter.sh`:
- Around line 110-112: Replace the unbounded "head /dev/urandom" pattern used to
generate random_suffix with a fixed-byte read from /dev/urandom so you only
consume the entropy you need; update the random_suffix assignment (the
random_suffix variable initialization) to read a small, fixed number of bytes
(e.g., one block) from /dev/urandom and then filter to [a-z0-9] and cut to 8
characters, rather than piping an open-ended head, to avoid buffering a large
chunk and wasting CPU/entropy.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7f8e5c39-9c43-4c14-b084-787bdb685d6d
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- deploy-scripts/lib/sentinel.sh
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rafabene The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
00fc4eb
into
openshift-hyperfleet:main
Summary by CodeRabbit
New Features
Improvements
Bug Fixes